Delhi Metro Network Analysis¶
Metro Network Analysis involves examining the network of metro systems to understand their structure, efficiency, and effectiveness. It typically includes analyzing routes, stations, traffic, connectivity, and other operational aspects.
Delhi Metro Network Analysis: Process We Can Follow¶
Analyzing the metro network in a city like Delhi helps improve urban transportation infrastructure, leading to better city planning and enhanced commuter experiences. Below is the process we can follow for the task of Metro Network Analysis of Delhi:
- Determine what you want to achieve. It could be optimizing routes, reducing congestion, improving passenger flow, or understanding travel patterns.
- Collect data on metro lines, stations, connections, and transit schedules.
- Clean the data for inconsistencies, missing values, or errors.
- Create visual representations of the network, such as route maps, passenger flow charts, or heat maps of station congestion.
- Analyze how effectively the network handles passenger traffic and meets operational targets.
Let’s get started with the task of Delhi Metro Network Analysis by importing the necessary Python libraries and the dataset:
pip install folium
Collecting foliumNote: you may need to restart the kernel to use updated packages. Obtaining dependency information for folium from https://files.pythonhosted.org/packages/ae/6d/18a7546e1748ecdd6ed7cd00d3f183faf1df08bd4f5e5e0eb3e72458b862/folium-0.17.0-py2.py3-none-any.whl.metadata Downloading folium-0.17.0-py2.py3-none-any.whl.metadata (3.8 kB) Collecting branca>=0.6.0 (from folium) Obtaining dependency information for branca>=0.6.0 from https://files.pythonhosted.org/packages/75/ca/6074ab4a04dd1a503201c18091b3426f3709670115fae316907a97f98d75/branca-0.7.2-py3-none-any.whl.metadata Downloading branca-0.7.2-py3-none-any.whl.metadata (1.5 kB) Requirement already satisfied: jinja2>=2.9 in c:\users\prince_singh04\anaconda3\lib\site-packages (from folium) (3.1.2) Requirement already satisfied: numpy in c:\users\prince_singh04\anaconda3\lib\site-packages (from folium) (1.24.3) Requirement already satisfied: requests in c:\users\prince_singh04\anaconda3\lib\site-packages (from folium) (2.31.0) Requirement already satisfied: xyzservices in c:\users\prince_singh04\anaconda3\lib\site-packages (from folium) (2022.9.0) Requirement already satisfied: MarkupSafe>=2.0 in c:\users\prince_singh04\anaconda3\lib\site-packages (from jinja2>=2.9->folium) (2.1.1) Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\prince_singh04\anaconda3\lib\site-packages (from requests->folium) (2.0.4) Requirement already satisfied: idna<4,>=2.5 in c:\users\prince_singh04\anaconda3\lib\site-packages (from requests->folium) (3.4) Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\prince_singh04\anaconda3\lib\site-packages (from requests->folium) (1.26.18) Requirement already satisfied: certifi>=2017.4.17 in c:\users\prince_singh04\anaconda3\lib\site-packages (from requests->folium) (2023.7.22) Downloading folium-0.17.0-py2.py3-none-any.whl (108 kB) ---------------------------------------- 0.0/108.4 kB ? eta -:--:-- ---------------------------------------- 0.0/108.4 kB ? eta -:--:-- --- ------------------------------------ 10.2/108.4 kB ? eta -:--:-- -------------- ------------------------ 41.0/108.4 kB 393.8 kB/s eta 0:00:01 ----------------------------------- -- 102.4/108.4 kB 737.3 kB/s eta 0:00:01 -------------------------------------- 108.4/108.4 kB 696.3 kB/s eta 0:00:00 Downloading branca-0.7.2-py3-none-any.whl (25 kB) Installing collected packages: branca, folium Successfully installed branca-0.7.2 folium-0.17.0
import pandas as pd
import folium
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.io as pio
pio.templates.default= "plotly_white"
metro_data= pd.read_csv("Delhi-Metro-Network.csv")
print(metro_data.head())
Station ID Station Name Distance from Start (km) Line \ 0 1 Jhil Mil 10.3 Red line 1 2 Welcome [Conn: Red] 46.8 Pink line 2 3 DLF Phase 3 10.0 Rapid Metro 3 4 Okhla NSIC 23.8 Magenta line 4 5 Dwarka Mor 10.2 Blue line Opening Date Station Layout Latitude Longitude 0 2008-04-06 Elevated 28.675790 77.312390 1 2018-10-31 Elevated 28.671800 77.277560 2 2013-11-14 Elevated 28.493600 77.093500 3 2017-12-25 Elevated 28.554483 77.264849 4 2005-12-30 Elevated 28.619320 77.033260
metro_data.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 285 entries, 0 to 284 Data columns (total 8 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Station ID 285 non-null int64 1 Station Name 285 non-null object 2 Distance from Start (km) 285 non-null float64 3 Line 285 non-null object 4 Opening Date 285 non-null object 5 Station Layout 285 non-null object 6 Latitude 285 non-null float64 7 Longitude 285 non-null float64 dtypes: float64(3), int64(1), object(4) memory usage: 17.9+ KB
Metro Network Analysis involves the application of data science techniques to understand and interpret the characteristics and dynamics of metro systems.
The provided dataset contains detailed information about the Delhi Metro network, one of the largest and busiest urban transit systems in the world. Key features of the dataset include:
- Station Information: Names and IDs of metro stations.
- Geographical Coordinates: Latitude and longitude of each station.
- Line Information: The specific metro line each station belongs to.
- Distance Data: The distance of each station from the start of its line.
- Station Layout: Type of station layout (e.g., Elevated, Underground, At-Grade).
- Opening Date: Date of inauguration of each station.
Your task is to:
- Map the stations to visualize the coverage and distribution of the metro network across Delhi.
- Examine characteristics of different metro lines, including station count and average distances between stations.
- Analyze the types of station layouts and their distribution across the network.
- Draw statistical correlations and insights, such as the relationship between station layout and distance from the city centre.
Now, let's have a look at whether the dataset has any null values or not and then look at the data types:
# checking for missing values
missing_values= metro_data.isnull().sum()
#checking data types
data_types= metro_data.dtypes
missing_values
Station ID 0 Station Name 0 Distance from Start (km) 0 Line 0 Opening Date 0 Station Layout 0 Latitude 0 Longitude 0 dtype: int64
data_types
Station ID int64 Station Name object Distance from Start (km) float64 Line object Opening Date object Station Layout object Latitude float64 Longitude float64 dtype: object
Now, I'll convert the Opening Data column to a datatime formate for ease of analysis:
# converting 'Opening Date' to Datetime formate
metro_data['Opening Date']= pd.to_datetime(metro_data['Opening Date'])
Geospatial Analysis¶
Now, I’ll start by visualizing the locations of the metro stations on a map. It will give us an insight into the geographical distribution of the stations across Delhi. We will use the latitude and longitude data to plot each station.
For this, I’ll create a map with markers for each metro station. Each marker will represent a station, and we’ll be able to analyze aspects like station density and geographic spread. Let’s proceed with this visualization:
# find the unique metro lines from the data set
metro_data['Line'].unique()
array(['Red line', 'Pink line', 'Rapid Metro', 'Magenta line',
'Blue line', 'Aqua line', 'Voilet line', 'Yellow line',
'Green line', 'Gray line', 'Orange line', 'Green line branch',
'Blue line branch'], dtype=object)
# defining a color scheme for the metro lines
line_colors= {
'Red line': 'red',
'Pink line': 'pink',
'Rapid Metro': 'cadetblue',
'Magenta line': 'darkred',
'Blue line': 'blue',
'Aqua line': 'black',
'Voilet line': 'purple',
'Yellow line': 'beige',
'Green line': 'green',
'Gray line': 'lightgray',
'Orange line': 'orange',
'Green line branch': 'lightgreen',
'Blue line branch': 'lightblue'
}
delhi_map_with_line_tooltip = folium.Map(location= [28.7041, 77.1025], zoom_start=11)
# adding colored markers for each metro station with line name in tooltip
for index, row in metro_data.iterrows():
line= row['Line']
color= line_colors.get(line, 'black') # Default color is black if line not found in the dictionary
folium.Marker(
location=[row['Latitude'],row['Longitude']],
popup=f"{row['Station Name']}",
tooltip =f"{row['Station Name']},{line}",
icon=folium.Icon(color=color)
).add_to(delhi_map_with_line_tooltip)
# Displaying the updated map
delhi_map_with_line_tooltip
Here is the map showing the geographical distribution of Delhi Metro stations. Each marker represents a metro station, and you can hover over or click on the markers to see the station name and the metro line it belongs to. This map provides a visual understanding of how the metro stations are spread across Delhi.
Now, I will analyze the growth of the Delhi Metro network over time. I’ll look at how many stations were opened each year and visualize this growth. It can provide insights into the pace of metro network expansion and its development phases.
I’ll start by extracting the year from the Opening Date and then count the number of stations opened each year. Following this, I’ll visualize this information in a bar plot. Let’s proceed with this analysis:
metro_data['Opening Year']= metro_data['Opening Date'].dt.year
# counting the number of stations opened each year
stations_per_year= metro_data['Opening Year'].value_counts().sort_index()
stations_per_year_df= stations_per_year.reset_index()
stations_per_year_df.columns= ['year', 'Number of Stations']
fig= px.bar(stations_per_year_df, x='year',y= 'Number of Stations',
title="Number of Metro Stations Opened Each Year in Delhi",
labels={'Year': 'Year', 'Number of Stations': 'Number of Stations Opened'})
fig.update_layout(xaxis_tickangle=-45, xaxis=dict(tickmode='linear'),
yaxis=dict(title='Number of Stations Opened'),
xaxis_title="Year")
fig.show()
The bar chart illustrates the number of Delhi Metro stations opened each year. This visualization helps us understand the temporal development of the metro network. Some key observations include:
- Some years show a significant number of new station openings, indicating phases of rapid network expansion.
- Conversely, there are years with few or no new stations, which could be due to various factors like planning, funding, or construction challenges.
Line Analysis¶
Now, I’ll analyze the various metro lines in terms of the number of stations they have and the average distance between stations. It will give us insights into the characteristics of each metro line, such as which lines are more extensive or denser.
I’ll calculate the number of stations per line and the average distance between stations on each line. I’ll then visualize these metrics to better understand the differences between the lines. Let’s start with these calculations:
stations_per_line= metro_data['Line'].value_counts()
# calculating the total distance of each metro line (max distance form start)
total_distance_per_line = metro_data.groupby('Line')['Distance from Start (km)'].max()
avg_distance_per_line= total_distance_per_line / (stations_per_line -1)
line_analysis= pd.DataFrame({
'Line': stations_per_line.index,
'Number of Stations': stations_per_line.values,
'Averages Distance Between Station (km)': avg_distance_per_line,
'Total Distance': total_distance_per_line
})
#sorting the DataFrame by the number of stations
line_analysis= line_analysis.sort_values(by= 'Number of Stations', ascending= False)
line_analysis.reset_index(drop= True, inplace= True)
print(line_analysis)
Line Number of Stations \
0 Blue line 49
1 Pink line 38
2 Yellow line 37
3 Voilet line 34
4 Red line 29
5 Magenta line 25
6 Aqua line 21
7 Green line 21
8 Rapid Metro 11
9 Blue line branch 8
10 Orange line 6
11 Gray line 3
12 Green line branch 3
Averages Distance Between Station (km) Total Distance
0 1.355000 27.1
1 1.097917 52.7
2 1.157143 8.1
3 1.950000 3.9
4 1.240000 24.8
5 1.050000 2.1
6 1.379167 33.1
7 4.160000 20.8
8 1.421622 52.6
9 1.000000 10.0
10 1.167857 32.7
11 1.318182 43.5
12 1.269444 45.7
The table presents a detailed analysis of the Delhi Metro lines, including the number of stations on each line and the average distance between stations.
To better understand these metrics, let’s visualize them. I’ll create two plots: one for the number of stations per line and another for the average distance between stations. It will provide a comparative view of the metro lines:
# creating subplots
fig = make_subplots(rows=1, cols=2, subplot_titles=('Number of Stations Per Metro Line',
'Average Distance Between Stations Per Metro Line'),
horizontal_spacing=0.2)
# plot for Number of Stations per Line
fig.add_trace(
go.Bar(y=line_analysis['Line'], x=line_analysis['Number of Stations'],
orientation='h', name='Number of Stations', marker_color='crimson'),
row=1, col=1
)
# plot for Average Distance Between Stations
fig.add_trace(
go.Bar(y=line_analysis['Line'], x=line_analysis['Averages Distance Between Station (km)'],
orientation='h', name='Average Distance (km)', marker_color='navy'),
row=1, col=2
)
# update xaxis properties
fig.update_xaxes(title_text="Number of Stations", row=1, col=1)
fig.update_xaxes(title_text="Average Distance Between Stations (km)", row=1, col=2)
# update yaxis properties
fig.update_yaxes(title_text="Metro Line", row=1, col=1)
fig.update_yaxes(title_text="", row=1, col=2)
# update layout
fig.update_layout(height=600, width=1200, title_text="Metro Line Analysis", template="plotly_white")
fig.show()
Station Layout Analysis¶
Next, I’ll explore the station layouts (Elevated, Ground Level, Underground). I’ll analyze the distribution of these layouts across the network and see if there are any patterns or trends, such as certain lines favouring a particular layout.
I’ll calculate the frequency of each layout type and then visualize these frequencies to get a clearer picture of the layout distribution. Let’s proceed with this:
layout_counts = metro_data['Station Layout'].value_counts()
# creating the bar plot using Plotly
fig = px.bar(x=layout_counts.index, y=layout_counts.values,
labels={'x': 'Station Layout', 'y': 'Number of Stations'},
title='Distribution of Delhi Metro Station Layouts',
color=layout_counts.index,
color_continuous_scale='pastel')
# updating layout for better presentation
fig.update_layout(xaxis_title="Station Layout",
yaxis_title="Number of Stations",
coloraxis_showscale=False,
template="plotly_white")
fig.show()
The bar chart and the counts show the distribution of different station layouts in the Delhi Metro network.
Observations:¶
- Elevated Stations: The majority of the stations are Elevated. It is a common design choice in urban areas to save space and reduce land acquisition issues.
- Underground Stations: The Underground stations are fewer compared to elevated ones. These are likely in densely populated or central areas where above-ground construction is less feasible.
- At-Grade Stations: There are only a few At-Grade (ground level) stations, suggesting they are less common in the network, possibly due to land and traffic considerations.
Summary¶
So, this is how you can perform Delhi Metro Network Analysis using Python. Metro Network Analysis involves examining the network of metro systems to understand their structure, efficiency, and effectiveness. It typically includes analyzing routes, stations, traffic, connectivity, and other operational aspects.